Abstract. - We have developed a generalized semi-analytic approach for efficiently computing cyclization and looping J factors of DNA under arbitrary binding constraints. Many biological systems involving DNA-protein interactions impose precise boundary conditions on DNA, which necessitates a treatment beyond the Shimada-Yamakawa model for ring cyclization. Our model allows DNA to be treated as a heteropolymer with sequence-dependent intrinsic curvature and stiffness. In this framework, we independently compute enthalpic and entropic contributions to the J factor and show that even at small length scales ($\sim \ell_p$) entropic effects are significant. We propose a simple analytic formula to describe our numerical results for a homogeneous DNA in planar loops, which can be used to predict experimental cyclization and loop formation rates as a function of loop size and binding geometry. We also introduce an effective torsional persistence length that describes the coupling between twist and bending of DNA when looped.

Introduction. - Calculating the probability that contact will occur between two distant ends of a polymer under prescribed orientations is a long-standing question of considerable significance in polymer physics. This problem was rigorously defined in the context of polyelectrolyte condensation as the ratio of equilibrium constants for cyclization and bimolecular association by introduction of the Jacobson-Stockmayer (J) factor [1]. Yamakawa and Stockmayer expanded on this work using the Kratky-Porod wormlike chain (WLC) model to compute the J factor of angle-independent DNA ring-closure probabilities [2]. Shimada and Yamakawa then included twist alignment of the end points [3], known as phasing, to explain the oscillatory cyclization rates measured by Shore and Baldwin on DNA shorter than 500 base pairs [4]. Shimada and Yamakawa calculated the J factor for the Ring and the unconstrained loop by treating DNA as a homopolymer with coincident end points and parallel tangent vectors, and with coincident end points and unconstrained tangent vectors, respectively (see fig. 1). Our work generalizes this closure probability to include arbitrary end point locations, binding orientations, and sequence-dependent curvature and elasticity.

We numerically calculate J factors based on a semi-analytic formulation that includes specified end point locations and orientations of the DNA. This formulation goes beyond the homogeneous straight elastic rod [5-7] by including as inputs intrinsic curvature and stiffness based upon sequence-dependent effects. While Monte Carlo methods have been successfully used to compute J factors [8,9], they are computationally very taxing, which in general makes it difficult to separate out the individual effects of curvature and stiffness; by contrast, our computation of the J factor based on a desired equilibrium shape takes only minutes on a MacBook Pro.

Our model independently calculates enthalpic and entropic contributions to the free energy of the DNA loop. The numerical results show that boundary-condition-dominated entropic contributions are important even for very short DNA on the order of a persistence length ($\ell_p$). Within a cell, DNA is normally constrained by histones and other binding constraints, leaving this length scale as the typical size of locally fluctuating DNA.

Many DNA-binding proteins impose very specific boundary conditions on DNA loop formation. Previous results have shown that boundary condition constraints on the DNA end points play a significant role in the facilitation of loop formation [10]. Boundary conditions have also been suggested by Tkachenko [11] as an explanation for the striking disagreements between the cyclization rates measured by Du et al. [12,13] and Cloutier et al. [14,15]. Therefore, any useful model for these interactions must accommodate such arbitrary boundary conditions. Thus, the J factor framework gives quantitative insights into the mechanics of protein-mediated DNA loop formation and is important for multi-scale models of larger DNA-protein assemblies such as chromatin and nucleosomes.

Fig. 1: Representation of the local basis vectors along the DNA in the open and looped states, respectively. In the looped state we prescribe the end point locations through a set of spatial coordinates (x, y, z) and angles ($\Theta$, $\Phi$, $\Psi$) between end point tangent vectors. Note the directions of $\hat{n}_1$, $\hat{n}_2$ are determined by the open state body fixed frame, rather than the looped state space-curve. Also note that the circular Ring corresponds to $\Theta = \Phi = 0$. Phasing of the two end points is represented by $\Psi$, and is often due to a mismatch in the helical repeat of 10.5 base pairs.



Theory. - To model the enthalpic and entropic contributions to the looping J factor, we use a coarse-grained elastic rod model for the DNA polymer to calculate the Hamiltonians describing thermal fluctuations about the open ($H^{o}$) and looped ($H^{\ell}$) states. The J factor is then calculated by comparing the probability density of finding the DNA in a configuration that corresponds to the looped state with enforced boundary conditions for the end points, described by three angles ($\Theta$, $\Phi$, $\Psi$) and three positions (x, y, z) (see fig. 1), to that of the open state without such constraints

$$ J = 8\pi^{2}\, \frac{\int [d\xi]\, e^{-\beta H^{\ell}(\xi)}\, \delta^{3}(\mathbf{u}(L))\, \delta(\theta_1(L))\, \delta(\theta_2(L))\, \delta(\psi(L))}{\int [d\xi]\, e^{-\beta H^{o}(\xi)}} \tag{1} $$

The integration is over the amplitudes $d\xi_i$ of the normal modes with eigenvalue $\lambda_i$ of the respective Hamiltonians, and $\beta = 1/(k_B T)$ is the inverse product of the Boltzmann constant $k_B$ and temperature $T$. The DNA is parameterized by the arc length $s$, where $s = 0$ and $s = L$ are taken to be the end points. The end points are subject to three angular constraints on $\theta_1(L)$, $\theta_2(L)$, $\psi(L)$ and a constraint on the relative displacement vector $\mathbf{u}$, which are imposed by $\delta(\theta_{1,2})$, $\delta(\psi)$ and $\delta^{3}(\mathbf{u})$, respectively.

The open state is characterized by three input local curvature components $\kappa^{o} = (\kappa_1^{o}(s), \kappa_2^{o}(s), \tau^{o}(s))$, which represent intrinsic curvature caused by sequence dependence. We include as inputs two bending persistence lengths $\ell_1(s)$, $\ell_2(s)$, corresponding to bending elasticity along the major and minor grooves of DNA, respectively, as well as a torsional persistence length $\ell_\tau(s)$.

The three equilibrium looped state curvature components $\kappa^{\ell} = (\kappa_1^{\ell}(s), \kappa_2^{\ell}(s), \tau^{\ell}(s))$ are found by minimizing the strain energy of DNA under specified orientations and positions of the endpoint tangent vectors, while tracking the DNA cross sections, as demonstrated by Goyal et al. [16]. This tracking allows the use of the body fixed vectors $(\hat{t}(s), \hat{n}_1(s), \hat{n}_2(s))$ as a basis to define angular deformations $(\theta_1(s), \theta_2(s), \psi(s))$ from the equilibrium open and looped states. The angles $\theta_{1,2}(s)$ are defined as rotations about the two open state normal vectors and $\psi(s)$ is defined as a rotation about the tangent vector of the open state.

The deformation-induced curvatures $\tilde{\kappa}^{o,\ell} = (\tilde{\kappa}_1^{o,\ell}(s), \tilde{\kappa}_2^{o,\ell}(s), \tilde{\tau}^{o,\ell}(s))$ are computed separately for the open ($\tilde{\kappa}^{o}$) and looped ($\tilde{\kappa}^{\ell}$) states. The looped state Hamiltonian is expressed as $H^{\ell} = E^{\ell} + \delta H(\theta_1, \theta_2, \psi)$, where $E^{\ell}$ is the strain, or enthalpic, energetic cost of loop formation. The open state equilibrium is the intrinsic curvature induced by sequence dependence. As DNA is in an aqueous solution, its motion is severely overdamped and the kinetic energy contributions to the Hamiltonian are neglected. The deformation Hamiltonian is then

$$ \beta H^{o,\ell} = \frac{1}{2} \int_0^L ds\, (\tilde{\kappa}^{o,\ell} - \kappa^{o})^{T} B(s)\, (\tilde{\kappa}^{o,\ell} - \kappa^{o}) \tag{2} $$

where $B(s)$ is the stiffness tensor, which we take to be diagonal with components $\ell_1(s)$, $\ell_2(s)$ and $\ell_\tau(s)$. The curvature components $\tilde{\kappa}$ in eq. (2) are organized into three groups based on their order in the deformation variables $(\theta_1, \theta_2, \psi)$. The zeroth order terms represent the strain energy of loop formation. The first order terms define the equilibrium conditions, and will vanish. The second order terms determine the normal modes, while the higher order terms are neglected. The equilibrium planar deformation curvatures that we will work with in this paper are

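$$ \tilde{\kappa}_1^{2} + \tilde{\kappa}_2^{2} = \left(\kappa_2^{2} + 2\kappa_2 \theta_2'\right) + \left(\theta_1'^{2} - \kappa_2^{2}\theta_1^{2} + \theta_2'^{2}\right) \tag{3} $$

$$ \tilde{\tau}^{2} = \ldots \tag{4} $$
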


Here we have assumed the DNA to be isotropic in bending stiffness, $\ell_p = \ell_1 = \ell_2$, and intrinsically straight. The curvature components contained in the Hamiltonian are made non-dimensional by scaling with the overall DNA length, $L$.

When constructing the Hamiltonian, the Galerkin method is used to numerically solve for the normal modes of the open and looped states. Each deformation variable $(\theta_1(s), \theta_2(s), \psi(s))$ is expanded in terms of $N$ orthogonal comparison functions, which are then used to create a $3N \times 3N$ Hamiltonian matrix for the open ($\mathcal{H}^{o}$) and the looped ($\mathcal{H}^{\ell}$) states. The comparison functions satisfy the angular boundary constraints imposed by $\delta(\theta_1(L))$, $\delta(\theta_2(L))$, $\delta(\psi(L))$ in eq. (1). The remaining looped boundary condition $\delta^{3}(\mathbf{u})$ is satisfied by Fourier expanding the delta functions, and then integrating over the eigenvector amplitudes $\xi_j$, leading to the constraint matrix $V$. An additional integration for the open state is required to cover the modes which cause displacements of the end points. The J factor can then be expressed as
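
As a minimal sketch of the Galerkin step described above, the following assembles the matrix for a single bending variable of an intrinsically straight rod, with matrix elements $\int_0^1 \ell(s)\,\phi_m'(s)\,\phi_n'(s)\,ds$ in units where $L = 1$. The basis $\phi_n(s) = \sin(n\pi s)$, the function name `galerkin_matrix` and the midpoint quadrature are illustrative assumptions, not the comparison functions or the numerical scheme used in this work.

```python
import math

def galerkin_matrix(stiffness, n_modes, n_quad=2000):
    """Assemble H[m][n] = integral_0^1 stiffness(s) * phi_m'(s) * phi_n'(s) ds.

    phi_n(s) = sin(n*pi*s) is an illustrative basis (an assumption, not the
    paper's comparison functions); it vanishes at both end points.
    """
    ds = 1.0 / n_quad
    H = [[0.0] * n_modes for _ in range(n_modes)]
    for j in range(n_quad):
        s = (j + 0.5) * ds                       # midpoint quadrature node
        ell = stiffness(s)
        # derivatives of the basis functions at s (mode numbers 1..n_modes)
        dphi = [(n + 1) * math.pi * math.cos((n + 1) * math.pi * s)
                for n in range(n_modes)]
        for m in range(n_modes):
            for n in range(n_modes):
                H[m][n] += ell * dphi[m] * dphi[n] * ds
    return H

# Uniform stiffness: the matrix is diagonal with entries (n*pi)^2 / 2.
H_uniform = galerkin_matrix(lambda s: 1.0, n_modes=4)

# A stiffness that varies along the rod couples the modes (off-diagonal terms).
H_varying = galerkin_matrix(lambda s: 1.0 + 0.5 * s, n_modes=4)
```

With a sequence-dependent stiffness the matrix is no longer diagonal, so the normal modes have to be obtained by diagonalizing it numerically.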




$$ J \;\propto\; \sqrt{\frac{\det \mathcal{H}^{o}}{(2\pi)^{3}\, \det \mathcal{H}^{\ell}\, \det V}}\;\; e^{-\beta E^{\ell}} \tag{5} $$



which is a product of two functions, one describing the entropic contributions, henceforth referred to as the entropic coefficient $A(\Theta)$, and the exponential term containing the enthalpic contributions. The lowest eigenmodes of the Hamiltonians converge in the limit of large $N$. The ratio of Hamiltonian determinants is finite because higher spatial frequency modes are less sensitive to the curvature of the shape. After the $M$th eigenmode, the ratio of eigenvalues converges

$$ \frac{\det \mathcal{H}^{o}}{\det \mathcal{H}^{\ell}} = \frac{\lambda_1^{o} \cdots \lambda_M^{o}}{\lambda_1^{\ell} \cdots \lambda_M^{\ell}} \tag{6} $$

The ratio of eigenvalues describes how the space accessible to thermal fluctuations of the DNA is reduced upon loop formation, which in turn allows us to quantify the entropic change of the system.
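
The truncation of the determinant ratio after $M$ modes can be illustrated with a short sketch. The two spectra below are hypothetical stand-ins, not eigenvalues computed in this work: the open state uses $(i\pi)^2$ and the looped state adds a constant shift, so that high modes become insensitive to the loop shape. The function name `log_det_ratio` is likewise an assumption. Summing logarithms avoids overflow when many eigenvalues are multiplied.

```python
import math

def log_det_ratio(open_eigs, loop_eigs, n_modes):
    """Sum of ln(lambda_i^o / lambda_i^l) over the lowest n_modes eigenvalues."""
    return sum(math.log(lam_o / lam_l)
               for lam_o, lam_l in zip(open_eigs[:n_modes], loop_eigs[:n_modes]))

# Hypothetical spectra, for illustration only.
open_eigs = [(i * math.pi) ** 2 for i in range(1, 401)]
loop_eigs = [lam + 10.0 for lam in open_eigs]   # shift matters only for low modes

# The partial sums settle down once the eigenvalue ratios approach one.
for M in (10, 50, 200, 400):
    print(M, round(log_det_ratio(open_eigs, loop_eigs, M), 4))
```

A negative value of the logarithm corresponds to a looped state that is stiffer than the open state, that is, to a reduced volume of accessible fluctuations.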

Results. - The results presented here are for near planar DNA loops with coincident end points and arbitrary loop tangent angle $\Theta$. In this paper we present results for DNA of length 50 nm, or approximately 14 helical repeats, so we will assume $\ell_1 = \ell_2$. We treat the DNA as a homogeneous polymer with bending and torsional persistence lengths of 50 nm and 75 nm, respectively [18-20]. While DNA is a heteropolymer with anisotropic bending persistence lengths $\ell_1(s) \neq \ell_2(s)$, this anisotropy largely averages out after a few helical repeats (10.5 base pairs each) as demonstrated by Kehrbaum and Maddocks [17].

Fig. 2: The entropic coefficient $A(\Theta)$ is largely dominated by contributions from the lowest eigenmode of the loop, $\lambda_1^{\ell}(\Theta)$. To illustrate this dependence, we write $A(\Theta) = f(\lambda_1^{\ell}(\Theta))\,\gamma(\Theta)$, where $f(\lambda_1^{\ell})$ contains only the contributions of the lowest eigenmode $\lambda_1^{\ell}$. The function $f(\lambda_1^{\ell})$ is represented by the bracketed quantity in eq. (7) and $\gamma(\Theta)$ is given in eq. (8). It is then clear that $\gamma(\Theta)$ is a slowly varying function on the interval from $\Theta = 0$ to $\Theta = 0.54\pi$, and then steadily increases on the interval from $\Theta = 0.54\pi$ to the Hairpin ($\Theta = \pi$). The shift in behavior of $\gamma(\Theta)$ occurs after the lowest eigenmode changes from symmetric to antisymmetric. Even for relatively short DNA, $A(\Theta)$ is shown to affect the J factor by an order of magnitude in fig. 3. Note the dimensions are in molarity rather than in concentration as in the equations of the text.

The entropic coefficient $A(\Theta)$ is computed for a torsionally unconstrained DNA loop with overall length $L = \ell_p$ and loop tangent angles ranging from a Ring, $\Theta = 0$, to a Teardrop, $\Theta \approx 0.54\pi$, to a Hairpin, $\Theta = \pi$, and is given in fig. 2. The lowest eigenvalue of the in-plane loops can be well approximated as $\lambda_1^{\ell} = 2\pi\Theta$. Factoring this contribution from $A(\Theta)$ reveals a slowly varying function $\gamma(\Theta)$. We are able to fit our numerical results to within 1% by using a modified Bessel function



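$$ J(\Theta) = \left[ I_0(2\pi\Theta)\, e^{-2\pi\Theta} \right] \gamma(\Theta)\, \frac{1}{\ell_p^{3}} \left[ \frac{\ell_p}{L} \right]^{11/2} \exp\left[ -\frac{\ell_p}{L} E(\Theta) + \frac{L}{4\ell_p} \right] \tag{7} $$

$$ \gamma(\Theta) = 365\,\Theta^{2} - 525\,\Theta + 32\pi^{3} \tag{8} $$

$$ E(\Theta) = \frac{1}{2} \int \kappa^{2}\, ds = 2.02\,(\Theta - 0.54\pi)^{2} + 14.05 \tag{9} $$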

where $\Theta$ is the loop tangent angle in radians. The fit is accurate for all angles $\Theta$, although the dimensional scaling of eq. (7) needs to be modified to $(\ell_p/L)^{6}$ when $\Theta = 0$, as the ring has a zero mode [3]. The unconstrained loop by contrast has a dimensional scaling of $(\ell_p/L)^{5}$, due to integrating over the orientations of the tangent vectors.

Fig. 3: A comparison of entropic coefficient effects on the J factors computed using only enthalpic considerations versus a full treatment of enthalpic and entropic considerations. The difference of several orders of magnitude demonstrates that the entropic changes are vital to the calculation of the J factor. Small angles and Hairpin structures are poorly described by the enthalpic-only extrapolations of the J factor. These results are for DNA of length 50 nm, and the difference increases as the length is increased.
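
The angular dependence of the fit can be explored with a short sketch that evaluates the ratio of two J factors at the same loop length. It assumes that eq. (7) has the form of a Bessel bracket $I_0(2\pi\Theta)e^{-2\pi\Theta}$ times $\gamma(\Theta)$ of eq. (8) times $\exp[-(\ell_p/L)E(\Theta)]$ with $E(\Theta)$ of eq. (9), multiplied by factors that depend only on $L$ and $\ell_p$; those factors cancel in the ratio, which therefore holds for $\Theta > 0$ only, since the Ring scales differently. The names `bessel_i0`, `gamma`, `strain_energy` and `relative_j` are illustrative.

```python
import math

def bessel_i0(x, terms=80):
    """Modified Bessel function I_0(x) from its series sum_k (x^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(1, terms + 1):
        total += term
        term *= (x * x / 4.0) / (k * k)
    return total

def gamma(theta):
    """Slowly varying factor of eq. (8)."""
    return 365.0 * theta ** 2 - 525.0 * theta + 32.0 * math.pi ** 3

def strain_energy(theta):
    """Fitted strain energy of eq. (9), in units of (l_p / L) k_B T."""
    return 2.02 * (theta - 0.54 * math.pi) ** 2 + 14.05

def relative_j(theta, theta_ref, lp_over_L=1.0):
    """J(theta) / J(theta_ref) at fixed loop length, for theta, theta_ref > 0."""
    def weight(t):
        x = 2.0 * math.pi * t                     # lowest eigenvalue, 2*pi*Theta
        entropic = bessel_i0(x) * math.exp(-x) * gamma(t)
        enthalpic = math.exp(-lp_over_L * strain_energy(t))
        return entropic * enthalpic
    return weight(theta) / weight(theta_ref)

# Teardrop relative to Hairpin for a loop of one persistence length.
ratio = relative_j(0.54 * math.pi, math.pi, lp_over_L=1.0)
print(ratio)
```

Working with a ratio sidesteps the length-dependent prefactor, so the sketch isolates how the entropic and enthalpic factors compete as the loop tangent angle changes.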

The Teardrop shape has the lowest strain (enthalpic) energy, $14.4\,(\ell_p/L)\,k_B T$, of any of the in-plane shapes, and is where the endpoint curvatures of the loop vanish. Enthalpic considerations demonstrate which loop tangent angle will produce the maximum J factor, although for small angles, as well as for Hairpin structures, entropic considerations are required to demonstrate the absolute behavior of the J factor, as seen in fig. 3.
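
To make the enthalpic argument concrete, the sketch below scans the fitted strain energy of eq. (9), read here as $E(\Theta) = 2.02\,(\Theta - 0.54\pi)^2 + 14.05$ in units of $(\ell_p/L)\,k_B T$, and locates its minimum. The function names and the grid resolution are illustrative choices, and the Boltzmann weight shown is the enthalpic factor alone, without the entropic coefficient.

```python
import math

def strain_energy(theta):
    """Fitted strain energy of eq. (9), in units of (l_p / L) k_B T."""
    return 2.02 * (theta - 0.54 * math.pi) ** 2 + 14.05

# Scan loop tangent angles from the Ring (0) to the Hairpin (pi).
angles = [i * math.pi / 1000 for i in range(1001)]
theta_min = min(angles, key=strain_energy)

def enthalpic_weight(theta, lp_over_L=1.0):
    """Enthalpic Boltzmann factor relative to the minimum-energy shape."""
    excess = strain_energy(theta) - strain_energy(theta_min)
    return math.exp(-lp_over_L * excess)

print(theta_min / math.pi, strain_energy(theta_min))
print(enthalpic_weight(0.0), enthalpic_weight(math.pi))
```

For comparison, a circle of unit length has $\frac{1}{2}\int \kappa^2 ds = 2\pi^2 \approx 19.74$, and the fit evaluated at $\Theta = 0$ gives about 19.86.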

Shimada and Yamakawa provided two special cases for the in-plane J factors [3]: the Ring, defined by aligned tangents, $\Theta = 0$, and the unconstrained loop. We have reproduced the Ring and unconstrained results to within 0.01%. The unconstrained loop formation assumes that $E(\Theta)$ given above in eq. (9) is symmetric about the Teardrop. As most biologically relevant cases do not fit neatly into one of these special cases, our generalized results allow a more accurate prediction of the J factor. To illustrate the effect of angular dependence on the J factor, we plot three J factors with different entropic coefficients $A$ in fig. 3. Two of these J factors have constant entropic coefficients $A \neq A(\Theta)$, that of the Ring and that of the unconstrained loop, while allowing the normal angular dependence of the enthalpic contributions $E(\Theta)$. Thus fig. 3 demonstrates that the changes to the entropic contributions as a function of $\Theta$ are critical in the J factor calculation.

Fig. 4: (a) The effective torsional persistence length, $\ell^{*}$, in units of $\ell_p$ as a function of loop formation angle, $\Theta$. The Ring has pure torsional modes with stiffness $\ell_\tau$, and as $\Theta$ increases the bending and torsional modes become coupled, reducing the effective torsional persistence length. (b) The torsion-bending coupling $\alpha(\Theta)$, shown as open circles, is quadratic from the Ring to the Teardrop as seen by the dashed line. From the Teardrop to the Hairpin, $\alpha(\Theta)$ is cubic in $\Theta$.

The general DNA-protein complex has spatially separated end points as well as prescribed angles ($\Theta$, $\Phi$, $\Psi$), which can be obtained from DNA-protein co-crystals, with LacI protein serving as the canonical example [21]. The extrapolation from the orientation-averaged loop is better than that from the ring, except for small angles $\Theta$. For all angles $\Theta$, the Bessel function fit given in eq. (7) is excellent, with a maximum error of less than 1%.

Computing the torsionally constrained J factor by including $\delta(\psi)$ in eq. (1) allows a determination of an effective torsional persistence length $\ell^{*}$. The coupling of torsion and bending elasticity can be computed as

(10)

where $J(\Theta)_{\psi = 0}$ and $J(\Theta)_{\psi \neq 0}$ are the torsionally constrained and unconstrained J factors, respectively. The effective torsional persistence length represents the conversion between twist and writhe.

The effective torsional persistence length can be written as a torsional and bending spring in series

$$ \frac{1}{\ell^{*}} = \frac{1}{\ell_\tau} + \alpha(\Theta)\, \frac{1}{\ell_p} \tag{11} $$



where all of the angular dependence is given by $\alpha$. In fig. 4 it is clear that $\alpha(\Theta)$ has a simple quadratic dependence up until the Teardrop shape and afterwards becomes cubic in $\Theta$



$$ \alpha(\Theta) = \ldots\,\Theta^{2} - \ldots\,\Theta, \qquad 0 < \Theta < 0.55\pi \tag{12} $$



$$ \alpha(\Theta) = 0.42\,\Theta^{3} - 2.55\,\Theta^{2} + 5.46\,\Theta - 3.87, \qquad 0.55\pi < \Theta < \pi \tag{13} $$
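
As a rough numerical illustration of the series-spring picture, the sketch below evaluates the cubic branch of the coupling, eq. (13), and combines it with the torsional and bending persistence lengths. It assumes that the series combination reads $1/\ell^{*} = 1/\ell_\tau + \alpha(\Theta)/\ell_p$, which is one reading of eq. (11), and it uses the 75 nm and 50 nm values quoted in the Results. The function names are illustrative, and only the Teardrop-to-Hairpin range is covered.

```python
import math

def alpha_cubic(theta):
    """Torsion-bending coupling of eq. (13), valid from Teardrop to Hairpin."""
    if not (0.55 * math.pi < theta <= math.pi):
        raise ValueError("eq. (13) applies only for 0.55*pi < theta <= pi")
    return 0.42 * theta ** 3 - 2.55 * theta ** 2 + 5.46 * theta - 3.87

def effective_torsional_length(theta, l_tau=75.0, l_p=50.0):
    """Series combination 1/l* = 1/l_tau + alpha/l_p (assumed form), in nm."""
    return 1.0 / (1.0 / l_tau + alpha_cubic(theta) / l_p)

for frac in (0.6, 0.8, 1.0):
    theta = frac * math.pi
    print(frac, alpha_cubic(theta), effective_torsional_length(theta))
```

Under this assumed form any positive coupling lowers $\ell^{*}$ below $\ell_\tau$, in line with the reduction of the effective torsional persistence length described for fig. 4.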

Conclusion. - We have developed a generalized approach for computing J factors of arbitrary loop shapes, which may include sequence-dependent stiffness and curvature. We have shown that the J factor varies strongly for near planar loop shapes as a function of loop tangent angle $\Theta$ for intrinsically straight DNA with isotropic bending stiffness. The in-plane J factors can be well fit with analytic functions for all $\Theta$. We have defined an effective torsional persistence length $\ell^{*}$ and a subsequent torsion-bending coupling $\alpha(\Theta)$, which are shown to vary significantly as a function of loop formation angle $\Theta$. Finally, our calculation is computationally very quick, taking only a few minutes per J factor for any set of input boundary conditions.